reasoning procedure
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Analyzing Task-Encoding Tokens in Large Language Models
Bai, Yu, Huang, Heyan, Piano, Cesare Spinoso-Di, Rondeau, Marc-Antoine, Chen, Sanxing, Gao, Yang, Cheung, Jackie Chi Kit
In-context learning (ICL) has become an effective solution for few-shot learning in natural language processing. Past work has found that, during this process, representations of the last prompt token are utilized to store task reasoning procedures, thereby explaining the working mechanism of in-context learning. In this paper, we seek to locate and analyze other task-encoding tokens whose representations store task reasoning procedures. Supported by experiments that ablate the representations of different token types, we find that template and stopword tokens are the most prone to be task-encoding tokens. In addition, we demonstrate experimentally that lexical cues, repetition, and text formats are the main distinguishing characteristics of these tokens. Our work provides additional insights into how large language models (LLMs) leverage task reasoning procedures in ICL and suggests that future work may involve using task-encoding tokens to improve the computational efficiency of LLMs at inference time and their ability to handle long sequences.
- North America > United States > New York (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (2 more...)
Interpretable Deep Tracking
Thérien, Benjamin, Czarnecki, Krzysztof
Imagine experiencing a crash as the passenger of an autonomous vehicle. Wouldn't you want to know why it happened? Current end-to-end optimizable deep neural networks (DNNs) in 3D detection, multi-object tracking, and motion forecasting provide little to no explanations about how they make their decisions. To help bridge this gap, we design an end-to-end optimizable multi-object tracking architecture and training protocol inspired by the recently proposed method of interchange intervention training (IIT). By enumerating different tracking decisions and associated reasoning procedures, we can train individual networks to reason about the possible decisions via IIT. Each network's decisions can be explained by the high-level structural causal model (SCM) it is trained in alignment with. Moreover, our proposed model learns to rank these outcomes, leveraging the promise of deep learning in end-to-end training, while being inherently interpretable.
- North America > United States > California > Los Angeles County > Long Beach (0.05)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- (5 more...)
- Transportation > Ground > Road (0.70)
- Transportation > Passenger (0.49)
Kazakov
The EL family of description logics (DLs) has been designed to provide a restricted syntax for commonly used DL constructors with the goal to guarantee polynomial complexity of reasoning. Yet, polynomial complexity does not always mean that the underlying reasoning procedure is efficient inpractice. In this paper we consider a simple DL ELO from the EL family that admits nominals, and argue that existing polynomial reasoning procedures for ELO can be impractical for many realistic ontologies. To solve the problem, we describe an optimization strategy in which the inference rules required for reasoning with nominals are avoided as much as possible. The optimized procedure is evaluated within the reasoner ELK and demonstrated to perform well in practice.
Reinforced Dynamic Reasoning for Conversational Question Generation
Pan, Boyuan, Li, Hao, Yao, Ziyu, Cai, Deng, Sun, Huan
This paper investigates a new task named Conversational Question Generation (CQG) which is to generate a question based on a passage and a conversation history (i.e., previous turns of question-answer pairs). CQG is a crucial task for developing intelligent agents that can drive question-answering style conversations or test user understanding of a given passage. Towards that end, we propose a new approach named Reinforced Dynamic Reasoning (ReDR) network, which is based on the general encoder-decoder framework but incorporates a reasoning procedure in a dynamic manner to better understand what has been asked and what to ask next about the passage. To encourage producing meaningful questions, we leverage a popular question answering (QA) model to provide feedback and fine-tune the question generator using a reinforcement learning mechanism. Empirical results on the recently released CoQA dataset demonstrate the effectiveness of our method in comparison with various baselines and model variants. Moreover, to show the applicability of our method, we also apply it to create multi-turn question-answering conversations for passages in SQuAD.
- North America > United States > Ohio (0.05)
- Asia > China (0.04)
Chain of Reasoning for Visual Question Answering
Wu, Chenfei, Liu, Jinlai, Wang, Xiaojie, Dong, Xuan
Reasoning plays an essential role in Visual Question Answering (VQA). Multi-step and dynamic reasoning is often necessary for answering complex questions. For example, a question "What is placed next to the bus on the right of the picture?" talks about a compound object "bus on the right," which is generated by the relation
Chain of Reasoning for Visual Question Answering
Wu, Chenfei, Liu, Jinlai, Wang, Xiaojie, Dong, Xuan
Reasoning plays an essential role in Visual Question Answering (VQA). Multi-step and dynamic reasoning is often necessary for answering complex questions. For example, a question "What is placed next to the bus on the right of the picture?" talks about a compound object "bus on the right," which is generated by the relation
Consequence-Driven Reasoning for Horn SHIQ Ontologies
Kazakov, Yevgeny (Oxford University)
We present a novel reasoning procedure for Horn SHIQ ontologies—SHIQ ontologies that can be translated to the Horn fragment of first-order logic. In contrast to traditional reasoning procedures for ontologies, our procedure does not build models or model representations, but works by deriving new consequent axioms. The procedure is closely related to the so-called completion-based procedure for EL++ ontologies, and can be regarded as an extension thereof. In fact, our procedure is theoretically optimal for Horn SHIQ ontologies as well as for the common fragment of EL++ and SHIQ. A preliminary empirical evaluation of our procedure on large medical ontologies demonstrates a dramatic improvement over existing ontology reasoners. Specifically, our implementation allows the classification of the largest available OWL version of Galen. To the best of our knowledge no other reasoner is able to classify this ontology.